Description

[0001] This invention relates to methods of detecting and cloning of individual mRNAs. 

[0002] The activities of genes in cells are reflected in the kinds and quantities of their mRNA and protein species.
Gene expression is crucial for processes such as aging, development, differentiation, metabolite production,
progression of the cell cycle, and infectious or genetic or other disease states. Identification of the expressed mRNAs
will be valuable for the elucidation of their molecular mechanisms, and for applications to the above processes.
[0003] Mammalian cells contain approximately 15,000 different mRNA sequences; however, each mRNA sequence
is present at a different frequency within the cell. Generally, mRNAs are expressed at one of three levels. A few
"abundant" mRNAs are present at about 10,000 copies per cell, about 3,000-4,000 "intermediate" mRNAs are present at
300-500 copies per cell, and about 11,000 "low-abundance" or "rare" mRNAs are present at approximately 15 copies
per cell. The numerous genes that are represented by intermediate and low frequencies of their mRNAs can be cloned
by a variety of well established techniques (see for example Sambrook et al., 1989, Molecular Cloning: A Laboratory
Manual, Second Edition, Cold Spring Harbor Press, pp. 8.6-8.35).

[0004] If some knowledge of the gene sequence or protein is available, several direct cloning methods can be used.
However, if the identity of the desired gene is unknown, one must be able to select or enrich for the desired gene product
in order to identify the "unknown" gene without expending large amounts of time and resources.
[0005] The identification of unknown genes can often involve the use of subtractive or differential hybridization
techniques. Subtractive hybridization techniques rely upon the use of very closely related cell populations, such that
differences in gene expression will primarily represent the gene(s) of interest. A key element of the subtractive
hybridization technique is the construction of a comprehensive complementary-DNA ("cDNA") library.
[0006] The construction of a comprehensive cDNA library is now a fairly routine procedure. PolyA mRNA is prepared
from the desired cells and the first strand of the cDNA is synthesized using RNA-dependent DNA polymerase ("reverse
transcriptase") and an oligodeoxynucleotide primer of 12 to 18 thymidine residues. The second strand of the cDNA is
synthesized by one of several methods, the more efficient of which are commonly known as "replacement synthesis"
and "primed synthesis".

[0007] Replacement synthesis involves the use of ribonuclease H ("RNAase H"), which cleaves the phosphodiester
backbone of RNA that is in an RNA:DNA hybrid, leaving a 3' hydroxyl and a 5' phosphate, to produce nicks and gaps in
the mRNA strand, creating a series of RNA primers that are used by E. coli DNA polymerase I, or its "Klenow" fragment,
to synthesize the second strand of the cDNA. This reaction is very efficient; however, the cDNAs produced most often
lack the 5' terminus of the mRNA sequence.

[0008] Primed synthesis to generate the second cDNA strand is a general name for several methods which are more
difficult than replacement synthesis yet clone the 5' terminal sequences with high efficiency. In general, after the
synthesis of the first cDNA strand, the 3' end of the cDNA strand is extended with terminal transferase, an enzyme which
adds a homopolymeric "tail" of deoxynucleotides, most commonly deoxycytidylate. This tail is then hybridized to a
primer of oligodeoxyguanylate or a synthetic fragment of DNA with a deoxyguanylate tail, and the second strand
of the cDNA is synthesized using a DNA-dependent DNA polymerase.

[0009] The primed synthesis method is effective, but the method is laborious, and all resultant cDNA clones have a
tract of deoxyguanylate immediately upstream of the mRNA sequence. This deoxyguanylate tract can interfere with
transcription of the DNA in vitro or in vivo and can interfere with the sequencing of the clones by the Sanger
dideoxynucleotide sequencing method.

[0010] Once both cDNA strands have been synthesized, the cDNA library is constructed by cloning the cDNAs into 
an appropriate plasmid or viral vector. In practice this can be done by directly ligating the blunt ends of the cDNAs into 
a vector which has been digested by a restriction endonuclease to produce blunt ends. Blunt end ligations are very 
inefficient, however, and this is not a common method of choice. A generally used method involves adding synthetic
linkers or adapters containing restriction endonuclease recognition sequences to the ends of the cDNAs. The cDNAs 
can then be cloned into the desired vector at a greater efficiency. 

[0011] Once a comprehensive cDNA library is constructed from a cell line, desired genes can be identified with the
assistance of subtractive hybridization (see for example Sargent T.D., 1987, Meth. Enzymol., Vol. 152, pp. 423-432;
Lee et al., 1991, Proc. Natl. Acad. Sci. USA, Vol. 88, pp. 2825-2830). A general method for subtractive hybridization
is as follows. The complementary strand of the cDNA is synthesized and radiolabelled. This single strand of cDNA can
be made from polyA mRNA or from the existing cDNA library. The radiolabelled cDNA is hybridized to a large excess
of mRNA from a closely related cell population. After hybridization the cDNA:mRNA hybrids are removed from the
solution by chromatography on a hydroxylapatite column. The remaining "subtracted" radiolabelled cDNA can then be
used to screen a cDNA or genomic DNA library of the same cell population.

[0012] Subtractive hybridization removes the majority of the genes expressed in both cell populations and thus
enriches for genes which are present only in the desired cell population. However, if the expression of a particular mRNA
sequence is only a few times more abundant in the desired cell population than the subtractive population, it may not
be possible to isolate the gene by subtractive hybridization.

[0013] Proc. Natl. Acad. Sci. USA, Vol. 86, pp. 5673-5677, August 1989 (Biochemistry) discloses "One-sided polymerase
chain reaction: the amplification of cDNA", a rapid technique, based on the polymerase chain reaction (PCR), for
the direct targeting, enhancement, and sequencing of previously uncharacterized cDNAs. This method is not limited
to previously sequenced transcripts, since it requires only two adjacent or partially overlapping specific primers from
only one side of the region to be amplified. These primers can be located anywhere within the message. The specific
primers are used in conjunction with nonspecific primers targeted either to the poly(A)+ region of the message or to
an enzymatically synthesized d(A) tail.

[0014] Pairwise combinations of specific and general primers allow for the amplification of regions both 3' and 5' to
the point of entry into the message. The amplified PCR products can be cloned, sequenced directly by genomic
sequencing, or labeled for sequencing by amplifying with a radioactive primer. We illustrate the power of this approach
by deriving the cDNA sequences for the skeletal muscle α-tropomyosins of European common frog (Rana temporaria)
and zebrafish (Brachydanio rerio) using only 300 ng of a total poly(A)+ preparation. In these examples, we gained initial
entry into the tropomyosin messages by using heterologous primers (to conserved regions) derived from the rat skeletal
muscle α-tropomyosin sequence. The frog and zebrafish sequences are used in an analysis of tropomyosin evolution
across the vertebrate phylogenetic spectrum. The results underscore the conservative nature of the tropomyosin
molecule and support the notion of a constrained heptapeptide unit as the fundamental structural motif of tropomyosin.
[0015] Nucleic Acids Research, Vol. 19, No. 7, discloses efficient double stranded sequencing of cDNA clones
containing long poly(A) tails using anchored poly(dT) primers. Sequencing double stranded DNA templates has become
a common and efficient procedure (1) for rapidly obtaining sequence data while avoiding preparation of single stranded
DNA. Here we report the applicability of this procedure to sequencing cDNA clones containing long stretches of
poly(A). Double stranded templates of cDNAs containing long poly(A) tracts are difficult to sequence with vector primers
(e.g. universal M13) which anneal downstream of the poly(A) tail. Sequencing with these primers results in a long
poly(T) ladder followed by a sequence which is difficult to read (Fig. 1). In an attempt to solve this problem we synthesized
three primers which contain (dT)17 and either (dA) or (dC) or (dG) at the 3' end. We reasoned that the presence of
these three bases at the 3' end would 'anchor' the primers at the upstream end of the poly(A) tail and allow sequencing
of the region immediately upstream of the poly(A) region.

[0016] Anchored primers were synthesized on an Applied Biosystems (ABI) 391 DNA synthesizer and used after
purification on Oligonucleotide Purification Cartridges (ABI). For sequencing with anchored primers, 5-10 µg of plasmid
DNA was denatured in a total volume of 50 µl containing 0.2 M sodium hydroxide and 0.16 mM EDTA by incubation
at 65°C for 10 minutes. The three poly(dT) anchored primers (2 pmol of each) were added and the mixture immediately
placed on ice. The solution was then neutralized by the addition of 5 µl of 5 M ammonium acetate pH 7.0. The DNA
was precipitated by addition of 150 µl of cold 95% ethanol and the pellet washed twice with cold 70% ethanol. The
pellet was dried for 5 minutes and then resuspended in 1 x sequencing buffer (1 x = 40 mM Tris-HCl pH 7.5, 20 mM
MgCl2, 50 mM NaCl). Primers were annealed by heating the solution for 2 minutes at 65°C followed by slow cooling to
room temperature. Sequencing reactions, using modified T7 DNA polymerase (Sequenase, United States Biochemicals),
were then carried out using [32P]α-dATP (>1000 Ci/mmole) according to the protocol supplied with the Sequenase
kit. Under these conditions over 300 bp of readable sequence could be obtained (Fig. 1). We have applied this
approach to several other poly(A)-containing cDNA clones with similar results. Sequencing of the opposite strand of
these cDNAs using insert-specific primers verified that the sequences obtained with the anchored primers occurred
directly upstream of the poly(A) region (data not shown).

[0017] The ability to directly obtain sequence immediately upstream from the poly(A) tail of cDNAs, as demonstrated
here, should be of particular importance to large scale efforts to generate sequence-tagged sites (STSs) (2) from cDNAs (3).
[0018] Nucleic Acids Research, Vol. 19, No. 13, p. 3747 discloses a novel 3' extension technique using random primers
in RNA-PCR.

[0019] In order to obtain sequence 3' to a partial ~2 kb titin sequence of the >21 kb titin mRNA (1) that was too
distant from the poly A tail for 3' RACE methodologies (2), RNA-PCR (3, 4) was done using a primer containing a
random hexamer at its 3' end. Four µg of rabbit cardiac muscle total RNA (5) in 25 µL were reverse transcribed per
BRL's recommendations using 100 ng RT primer (Figure 1) and 200 U MLV reverse transcriptase (BRL). After RNAse
H digestion, 10 µL was used for PCR in 100 µL using primers complementary to either known titin sequence (Figure
1, TS1) or the RT primer (Figure 1, Y primer), and 30 cycles of 93°C-45 sec, 45°C-1.5 min, 72°C-3.0 min. Although
defined fragments from 100-1000 bp were observed after the first PCR (Figure 2a), only fragments <700 bp purified
by Geneclean (Bio 101) re-amplified during the second PCR using primers complementary to the RT primer (Figure
1, X primer containing a SalI site) or the known titin sequence (Figure 1, TS2 containing a NotI site). Lower or higher
concentrations of RT primer resulted in no or very small amplification products, respectively. Only the random hexamer
part of the RT primer initiated reverse transcription from 6-bp sequences of titin mRNA that had at least 50% G/C
content.

[0020] Final amplification products (Figure 2b) were sequenced by dideoxy chain termination methods using
Sequenase (US Biochemical) after digestion with NotI and SalI restriction enzymes and ligation into pBluescript (Stratagene).
Two clones were sequenced because of the possibility of infidelity of the Taq polymerase. After repeating this entire
procedure three times using new sets of titin-specific primers, the titin sequence was extended 1109 bp (EMBL
accession no. X59596). The second amplification step could perhaps be omitted or done asymmetrically for sequencing.
[0021] This technique should also be applicable to 5' extensions of cDNA clones. Perhaps a modification of it could
be applied to extensions of known genomic DNA in either direction.

[0022] The Journal of Cell Biology, Vol. 115, No. 4, November 1991, 887-903 discloses an analysis of vertebrate 
mRNA sequences: intimations of translational control.

[0023] Five structural features in mRNAs have been found to contribute to the fidelity and efficiency of initiation by
eukaryotic ribosomes. Scrutiny of vertebrate cDNA sequences in light of these criteria reveals a set of transcripts
(encoding oncoproteins, growth factors, transcription factors, and other regulatory proteins) that seem designed to be
translated poorly. Thus, throttling at the level of translation may be a critical component of gene regulation in vertebrates. 
An alternative interpretation is that some (perhaps many) cDNAs with encumbered 5' noncoding sequences represent 
mRNA precursors, which would imply extensive regulation at a posttranscriptional step that precedes translation. 

[0024] We have discovered a method for identifying, isolating and cloning mRNAs as cDNAs using a polymerase
amplification method that employs at least two oligodeoxynucleotide primers. In one approach, the first primer contains
sequence capable of hybridizing to a site including sequence that is immediately upstream of the first A ribonucleotide
of the mRNA's polyA tail and the second primer contains arbitrary sequence. In another approach, the first primer
contains sequence capable of hybridizing to a site including the mRNA's polyA signal sequence and the second primer
contains arbitrary sequence. In another approach, the first primer contains arbitrary sequence and the second primer
contains sequence capable of hybridizing to a site including the mRNA's Kozak sequence. In another approach, the
first primer contains a sequence that is substantially complementary to the sequence of a mRNA having a known
sequence and the second primer contains arbitrary sequence. In another approach, the first primer contains arbitrary
sequence and the second primer contains sequence that is substantially identical to the sequence of a mRNA having
a known sequence. The first primer is used as a primer for reverse transcription of the mRNA and the resultant cDNA
is amplified with a polymerase using both the first and second primers as a primer set.

[0025] Using this method with different pairs of the alterable primers, virtually any or all of the mRNAs from any cell
type or any stage of the cell cycle, including very low abundance mRNAs, can be identified and isolated. Additionally,
a comparison of the mRNAs from closely related cells, which may be for example at different stages of development
or different stages of the cell cycle, can show which of the mRNAs are constitutively expressed and which are
differentially expressed, and their respective frequencies of expression.

[0026] The "first primer" or "first oligodeoxynucleotide" as used herein is defined as being the oligodeoxynucleotide
primer that is used for the reverse transcription of the mRNA to make the first cDNA strand, and then is also used for
amplification of the cDNA. The first primer can also be referred to as the 3' primer, as this primer will hybridize to the
mRNA and will define the 3' end of the first cDNA strand. The "second primer" as used herein is defined as being the
oligodeoxynucleotide primer that is used to make the second cDNA strand, and is also used for the amplification of
the cDNA. The second primer may also be referred to as the 5' primer, as this primer will hybridize to the first cDNA
strand and will define the 5' end of the second cDNA strand.
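
The roles of the two primers can be illustrated with a minimal in silico sketch. It is not part of the disclosed method: it assumes perfect base pairing, writes the mRNA in the DNA alphabet (T in place of U), and the function names and the example sequences are hypothetical. The first (3') primer is located on the mRNA through its reverse complement, and the second (5') primer, which has the same sense as the mRNA, marks the other end of the amplified region.

```python
from typing import Optional

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA-alphabet sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def first_strand_template(mrna: str, first_primer: str) -> Optional[str]:
    """Return the stretch of mRNA copied into first-strand cDNA, or None.

    The first (3') primer pairs with the mRNA site that equals its reverse
    complement; reverse transcription then copies everything upstream of it.
    Assumption of this sketch: only a perfect match counts as annealing.
    """
    site = reverse_complement(first_primer)
    pos = mrna.find(site)
    if pos < 0:
        return None
    return mrna[: pos + len(site)]


def amplified_region(mrna: str, first_primer: str, second_primer: str) -> Optional[str]:
    """Return the mRNA-sense sequence bounded by the two primers, or None."""
    copied = first_strand_template(mrna, first_primer)
    if copied is None:
        return None
    # The second (5') primer has the same sense as the mRNA, so it pairs with
    # the first cDNA strand wherever the copied mRNA contains its sequence.
    start = copied.find(second_primer)
    if start < 0:
        return None
    return copied[start:]


# Hypothetical 25-base message body followed by a 15-base polyA tail.
mrna = "GGCATGCCTTAGACGTACCGATTGC" + "A" * 15
product = amplified_region(mrna, "TTTTTTTTTTTGC", "CCTTAGACGT")
print(product, len(product))
```

In this toy case the anchored first primer fixes the 3' boundary at the junction with the polyA tail, and the arbitrary 10-mer fixes the 5' boundary, so the length of the product depends on where the arbitrary primer happens to match.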

[0027] The "arbitrary" sequence of an oligodeoxynucleotide primer as used herein is defined as being based upon
or subject to individual judgement or discretion. In some instances, the arbitrary sequence can be entirely random or
partly random for one or more bases. In other instances the arbitrary sequence can be selected to contain a specific
ratio of each deoxynucleotide, for example approximately equal proportions of each deoxynucleotide or predominantly
one deoxynucleotide, or to not contain a specific deoxynucleotide. The arbitrary sequence can be selected to contain,
or not to contain, a recognition site for a specific restriction endonuclease. The arbitrary sequence can be selected to
either contain a sequence that is substantially identical (at least 50% homologous) to a mRNA of known sequence or to
not contain sequence from a mRNA of known sequence.

[0028] An oligodeoxynucleotide primer can be either "complementary" to a sequence or "substantially identical" to a
sequence. As defined herein, a complementary oligodeoxynucleotide primer is a primer that contains a sequence which
will hybridize to an mRNA, that is, the bases are complementary to each other and a reverse transcriptase will be able
to extend the primer to form a cDNA strand of the mRNA. As defined herein, a substantially identical primer is a primer
that contains sequence which is the same as the sequence of an mRNA, that is, greater than 50% identical, and the
primer has the same orientation as an mRNA; thus it will not hybridize to, or complement, an mRNA, but such a primer
can be used to hybridize to the first cDNA strand and can be extended by a polymerase to generate the second cDNA
strand. The terms of art "hybridization" or "hybridize", as used herein, are defined to be the base pairing of an
oligodeoxynucleotide primer with a mRNA or cDNA strand. The "conditions under which" an oligodeoxynucleotide hybridizes
with an mRNA or a cDNA, as used herein, are defined to be temperature and buffer conditions (that are described later)
under which the base pairing of the oligodeoxynucleotide primer with either an mRNA or a cDNA occurs and only a
few mismatches (one or two) of the base pairing are permissible.
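
The two definitions can be restated as a short sketch. It is only an illustration under stated assumptions: sequences are compared position by position over equal lengths, "substantially identical" is read as more than 50% identical positions, hybridization is read as base pairing with at most two mismatches, and all function names and test sequences are hypothetical.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA-alphabet sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def mismatches(a: str, b: str) -> int:
    """Count positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a, b))


def percent_identity(primer: str, mrna_segment: str) -> float:
    """Percentage of positions at which the primer equals the mRNA segment."""
    return 100.0 * (len(primer) - mismatches(primer, mrna_segment)) / len(primer)


def is_substantially_identical(primer: str, mrna_segment: str) -> bool:
    """Same orientation as the mRNA and more than 50% identical."""
    return percent_identity(primer, mrna_segment) > 50.0


def can_hybridize(primer: str, mrna_segment: str, max_mismatches: int = 2) -> bool:
    """Primer base-pairs with the mRNA segment with only a few mismatches."""
    return mismatches(reverse_complement(primer), mrna_segment) <= max_mismatches


segment = "GCCACCATGG"                       # hypothetical stretch of an mRNA
sense_primer = "GCCGCCATGG"                  # same orientation, one difference
antisense_primer = reverse_complement(segment)

print(percent_identity(sense_primer, segment))
print(is_substantially_identical(sense_primer, segment))
print(can_hybridize(sense_primer, segment))
print(can_hybridize(antisense_primer, segment))
```

The sense primer is substantially identical to the segment but cannot pair with it, so it can only serve as a second primer on the first cDNA strand; the antisense primer pairs with the mRNA itself and can serve as a first primer.
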
[0029] An oligonucleotide primer can contain a sequence that is known to be a "consensus sequence" of an mRNA
of known sequence. As defined herein, a "consensus sequence" is a sequence that has been found in a gene family 
of proteins having a similar function or similar properties. The use of a primer that includes a consensus sequence 
may result in the cloning of additional members of a desired gene family. 

[0030] The "preferred length" of an oligodeoxynucleotide primer, as used herein, is determined from the desired
specificity of annealing and the number of oligodeoxynucleotides having the desired specificity that are required to
hybridize to all the mRNAs in a cell. An oligodeoxynucleotide primer of 20 nucleotides is more specific than an
oligodeoxynucleotide primer of 10 nucleotides; however, addition of each random nucleotide to an oligodeoxynucleotide
primer increases fourfold the number of oligodeoxynucleotide primers required in order to hybridize to every mRNA in
a cell.

[0031] In one aspect, in general, the invention features a method for identifying and isolating mRNAs by priming a
preparation of mRNA for reverse transcription with a first oligodeoxynucleotide primer that contains sequence capable
of hybridizing to a site including sequence that is immediately upstream of the first A ribonucleotide of the mRNA's
polyA tail, and amplifying the cDNA by a polymerase amplification method using the first primer and a second
oligodeoxynucleotide primer, for example a primer having arbitrary sequence, as a primer set.

[0032] In preferred embodiments, the first primer contains at least 1 nucleotide at the 3' end of the
oligodeoxynucleotide that can hybridize to an mRNA sequence that is immediately upstream of the polyA tail, and contains at least 11
nucleotides at the 5' end that will hybridize to the polyA tail. The entire 3' oligodeoxynucleotide is preferably at least
13 nucleotides in length, and can be up to 20 nucleotides in length.

[0033] Most preferably, the first primer contains 2 nucleotides at the 3' end of the oligodeoxynucleotide that can
hybridize to an mRNA sequence that is immediately upstream of the polyA tail. Preferably, the 2
polyA-non-complementary nucleotides are of the sequence VN, where V is deoxyadenylate ("dA"), deoxyguanylate ("dG"), or
deoxycytidylate ("dC"), and N, the 3' terminal nucleotide, is dA, dG, dC, or deoxythymidylate ("dT"). Thus the sequence of a
preferred first primer is 5'-TTTTTTTTTTTVN [Seq. ID. No. 1]. The use of 2 nucleotides can provide accurate positioning
of the first primer at the junction between the mRNA and its polyA tail, as the properly aligned oligodeoxynucleotide:
mRNA hybrids are more stable than improperly aligned hybrids, and thus the properly aligned hybrids will form and
remain hybridized at higher temperatures. In preferred applications, the mRNA sample will be divided into at least
twelve aliquots and one of the 12 possible VN sequences of the first primer will be used in each reaction to prime the
reverse transcription of the mRNA. The use of an oligodeoxynucleotide with a single sequence will reduce the number
of mRNAs to be analyzed in each sample by binding to a subset of the mRNAs, statistically 1/12th, thus simplifying
the identification of the mRNAs in each sample.
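
As a minimal sketch of this partitioning, the twelve anchored primers can be enumerated and a message assigned to the single primer that aligns at its polyA junction. The sketch assumes perfect pairing and an mRNA written in the DNA alphabet; the function names and the example message are hypothetical. V excludes dT because a T at that position would simply pair with one more A of the tail.

```python
from itertools import product

_COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}
V_BASES = "ACG"    # V: dA, dC or dG
N_BASES = "ACGT"   # N: any of the four deoxynucleotides


def anchored_primers(n_t: int = 11) -> list:
    """Enumerate the anchored first primers 5'-(T)n V N-3'."""
    return ["T" * n_t + v + n for v, n in product(V_BASES, N_BASES)]


def primer_for(mrna: str, n_t: int = 11) -> str:
    """Return the anchored primer that aligns at the mRNA/polyA junction.

    V pairs with the last non-A base before the tail and N, the 3' terminal
    nucleotide of the primer, pairs with the base just upstream of that one.
    """
    body = mrna.rstrip("A")  # remove the polyA tail
    if len(body) < 2 or len(mrna) - len(body) < n_t:
        raise ValueError("need at least 2 body bases and a tail of n_t A residues")
    v = _COMPLEMENT[body[-1]]
    n = _COMPLEMENT[body[-2]]
    return "T" * n_t + v + n


primers = anchored_primers()
print(len(primers))

# Hypothetical message ending ...GC followed by a 15-base polyA tail.
mrna = "GGCATGCCTTAGACGTACCGATTGC" + "A" * 15
print(primer_for(mrna))
```

Because every message maps to exactly one of the twelve primers, each aliquot samples a distinct subset of the mRNA population.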

[0034] In some embodiments, the 3' end of the first primer can have 1 nucleotide that can hybridize to an mRNA
sequence that is immediately upstream of the polyA tail, and 12 nucleotides at the 5' end that will hybridize to the polyA
tail; thus the primer will have the sequence 5'-TTTTTTTTTTTTV [Seq. ID. No. 2]. The use of a single
non-polyA-complementary deoxynucleotide would decrease the number of oligodeoxynucleotides that are required to identify
every mRNA to 3; however, the use of a single nucleotide to position the annealing of primer to the junction of the
mRNA sequence and the polyA tail may result in a significant loss of specificity of the annealing, and 2
non-polyA-complementary nucleotides are preferred.

[0035] In some embodiments, the 3' end of the first primer can have 3 or more nucleotides that can hybridize to an
mRNA sequence that is immediately upstream of the polyA tail. The addition of each nucleotide to the 3' end will further
increase the stability of properly aligned hybrids, and the sequence to hybridize to the polyA tail can be decreased by
one nucleotide for each additional non-polyA-complementary nucleotide added. The use of such a first primer may not
be practical for rapid screening of the mRNAs contained within a given cell line, as the use of a first primer with more
than 2 nucleotides that hybridize to the mRNA immediately upstream of the polyA tail significantly increases the number
of oligodeoxynucleotides required to identify every mRNA. For instance, the primer 5'-TTTTTTTTTTVNN [Seq. ID. No.
3] would require the use of 48 separate first primers in order to bind to every mRNA, and would significantly increase
the number of reactions required to screen the mRNA from a given cell line. The use of oligodeoxynucleotides with a
single random nucleotide in one position as a group of four can circumvent the problem of needing to set up 48 separate
reactions in order to identify every mRNA. However, as the non-polyA-complementary sequence becomes longer, it
would quickly become necessary to increase the number of reactions required to identify every mRNA.
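
The primer counts quoted above follow from three choices for V and four for each N, that is 3 x 4^(k-1) primers for k anchoring nucleotides. The sketch below checks that arithmetic and shows one way of pooling the 48 VNN primers into groups of four. The choice of which position is left random within a pool (here the 3' terminal one) is an assumption of the sketch, as are the function names.

```python
from collections import defaultdict
from itertools import product

V_BASES = "ACG"
N_BASES = "ACGT"


def count_anchored_primers(k: int) -> int:
    """Number of distinct anchors of the form V followed by (k - 1) N."""
    if k < 1:
        raise ValueError("k must be at least 1")
    return 3 * 4 ** (k - 1)


def vnn_primers(total_len: int = 13) -> list:
    """Enumerate 5'-(T)n VNN-3' primers of a fixed total length."""
    n_t = total_len - 3
    return ["T" * n_t + v + n1 + n2
            for v, n1, n2 in product(V_BASES, N_BASES, N_BASES)]


def pool_by_shared_prefix(primers: list) -> dict:
    """Group primers that differ only at the 3' terminal nucleotide."""
    pools = defaultdict(list)
    for p in primers:
        pools[p[:-1]].append(p)
    return dict(pools)


for k in (1, 2, 3, 4):
    print(k, count_anchored_primers(k))

pools = pool_by_shared_prefix(vnn_primers())
print(len(pools), {len(members) for members in pools.values()})
```

Pooling in groups of four brings the 48 VNN primers back to 12 reactions, the same number as for the VN design, while a fourth anchoring nucleotide would again multiply the count by four.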

[0036] In preferred embodiments, the second primer is of arbitrary sequence and is at least 9 nucleotides in length. 
Preferably the second primer is at most 13 nucleotides in length and can be up to 20 nucleotides in length. 
[0037] In another aspect, in general, the invention features a method for preparing and isolating mRNAs by priming
a preparation of mRNA for reverse transcription with a first primer that contains a sequence capable of hybridizing to
the polyadenylation signal sequence and at least 4 nucleotides that are positioned 5', or 3', or both of the polyadenylation
signal sequence; this entire first primer is preferably at least 10 nucleotides in length, and can be up to 20 nucleotides
in length. In one preferred embodiment the sequence 5'-NNTTTATTNN [Seq. ID. No. 4] can be chosen such that the
sequence is 5'-GCTTTATTNC [Seq. ID. No. 5], and the four resultant primers are used together in a single reaction for
the priming of the mRNA for reverse transcription. Once the first cDNA strand has been formed by reverse transcription
then the first primer can be used with a second primer, for example an arbitrary sequence primer, for the amplification
of the cDNA.
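
A degenerate primer of this kind stands for a small set of defined sequences. The sketch below expands such a primer and confirms that each member can pair with the polyadenylation signal AATAAA. It takes the constrained primer to read 5'-GCTTTATTNC, that is the NNTTTATTNN template with three of the four N positions fixed, which is an assumption of this sketch; the function names are hypothetical.

```python
from itertools import product

# Subset of the IUPAC nucleotide codes needed here.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "V": "ACG", "N": "ACGT"}
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA-alphabet sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def expand(degenerate: str) -> list:
    """List every defined sequence represented by a degenerate primer."""
    return ["".join(bases) for bases in product(*(IUPAC[c] for c in degenerate))]


primers = expand("GCTTTATTNC")
print(primers)

# Each primer pairs with an mRNA site containing the signal AATAAA.
for p in primers:
    print(p, reverse_complement(p), "AATAAA" in reverse_complement(p))

print(len(expand("NNTTTATTNN")))
```

Fixing three of the four N positions reduces the fully degenerate template from 256 sequences to four, which is few enough to combine in a single reverse transcription reaction.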

[0038] In one aspect, in general, the invention features a method for identifying and isolating mRNAs by priming a
preparation of mRNA for reverse transcription with a first oligodeoxynucleotide primer to generate a first cDNA strand,
and priming the preparation of the second cDNA strand with a second primer that contains sequence substantially 
identical to the Kozak sequence of mRNA, and amplifying the cDNA by a polymerase amplification method using the 
first and second primers as a primer set. 

[0039] In preferred embodiments, the first and second primers are at least 9 deoxynucleotides in length, and are at 
most 13 nucleotides in length, and can be up to 20 nucleotides in length. Most preferably the first and second primers
are 10 deoxynucleotides in length. 

[0040] In preferred embodiments the sequence of the first primer is selected at random, or the first primer contains 
a selected arbitrary sequence, or the first primer contains a restriction endonuclease recognition sequence. 
[0041] In preferred embodiments the sequence of the second primer that contains sequence substantially identical
to the Kozak sequence of mRNA has the sequence NNNANNATGN [Seq. ID No. 6], or has the sequence NNNANNATGG
[Seq. ID No. 7], where N is any of the four deoxynucleotides. Preferably, the second primer has the sequence
GCCACCATGG [Seq. ID No. 8]. In some embodiments the first primer may further include a restriction endonuclease 
recognition sequence that is added to either the 5' or 3' end of the primer increasing the length of the primer by at least 
5 nucleotides. 
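
The relation between the degenerate Kozak primers and the preferred defined primer can be checked with a short pattern-matching sketch. It is an illustration only: the function names and the test message are hypothetical, and matching is exact rather than a model of hybridization.

```python
import re

# Regular-expression classes for the symbols used in the Kozak primers.
_IUPAC_CLASS = {"A": "A", "C": "C", "G": "G", "T": "T", "N": "[ACGT]"}


def _to_regex(pattern: str) -> str:
    """Translate a degenerate primer into a regular-expression string."""
    return "".join(_IUPAC_CLASS[c] for c in pattern)


def fits(primer: str, pattern: str) -> bool:
    """True if a defined primer is one member of the degenerate pattern."""
    return re.fullmatch(_to_regex(pattern), primer) is not None


def candidate_sites(mrna: str, pattern: str) -> list:
    """Start positions in the mRNA that match the pattern (overlaps allowed)."""
    lookahead = re.compile("(?=" + _to_regex(pattern) + ")")
    return [m.start() for m in lookahead.finditer(mrna)]


print(fits("GCCACCATGG", "NNNANNATGN"))
print(fits("GCCACCATGG", "NNNANNATGG"))
print(fits("GCCTCCATGG", "NNNANNATGN"))

# Hypothetical message written in the DNA alphabet.
print(candidate_sites("TTGCCACCATGGCA", "NNNANNATGN"))
```

The preferred primer GCCACCATGG is a member of both degenerate families, while a primer lacking the A three bases upstream of ATG is not.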

[0042] In another aspect, in general, the invention features a method for identifying and isolating mRNAs by priming
a preparation of mRNA for reverse transcription with a first oligodeoxynucleotide primer that contains sequence that
is substantially complementary to the sequence of a mRNA having a known sequence, and priming the preparation of
the second cDNA strand with a second primer, and amplifying the cDNA by a polymerase amplification method using
the first and second primers as a primer set.

[0043] In preferred embodiments, the first and second primers are at least 9 deoxynucleotides in length, and are at
most 13 nucleotides in length, and can be up to 20 nucleotides in length. Most preferably the first and second primers 
are 10 deoxynucleotides in length. 

[0044] In preferred embodiments the sequence of the first primer further includes a restriction endonuclease
sequence, which may be included within the preferred 10 nucleotides of the primer or may be added to either the 3' or
5' end of the primer, increasing the length of the oligodeoxynucleotide primer by at least 5 nucleotides.

[0045] In preferred embodiments the sequence of the second primer is selected at random, or the second primer 
contains a selected arbitrary sequence, or the second primer contains a restriction endonuclease recognition sequence. 
[0046] In another aspect, in general, the invention features a method for identifying and isolating mRNAs by priming
a preparation of mRNA for reverse transcription with a first oligodeoxynucleotide primer, and priming the preparation
of the second cDNA strand with a second primer that contains sequence that is substantially identical to the sequence
of a mRNA having a known sequence, and amplifying the cDNA by a polymerase amplification method using the first
and second primers as a primer set.

[0047] In preferred embodiments, the first and second primers are at least 9 deoxynucleotides in length, and are at 
most 13 nucleotides in length, and can be up to 20 nucleotides in length. Most preferably the first and second primers
are 10 deoxynucleotides in length.

[0048] In preferred embodiments the sequence of the first primer is selected at random, or the first primer contains 
a selected arbitrary sequence, or the first primer contains a restriction endonuclease recognition sequence. 
[0049] In preferred embodiments the sequence of the second primer having a sequence that is substantially
complementary to the sequence of an mRNA having a known sequence further includes a restriction endonuclease
sequence, which may be included within the preferred 10 nucleotides of the primer or may be added to either the 3' or
5' end of the primer, increasing the length of the oligodeoxynucleotide primer by at least 5 nucleotides.
[0050] In another aspect, in general, the invention features a method for identifying and isolating mRNAs by priming
a preparation of mRNA for reverse transcription with a first oligodeoxynucleotide primer that contains sequence that
is substantially complementary to the sequence of a mRNA having a known sequence, and priming the preparation of
the second cDNA strand with a second primer that contains sequence that is substantially identical to the Kozak
sequence of mRNA, and amplifying the cDNA by a polymerase amplification method using the first and second primers
as a primer set.

[0051] In preferred embodiments, the first and second primers are at least 9 deoxynucleotides in length, and are at 
most 13 nucleotides in length, and can be up to 20 nucleotides in length. Most preferably the first and second primers
are 10 deoxynucleotides in length.

[0052] In some preferred embodiments of each of the general aspects of the invention, the amplified cDNAs are 
separated and then the desired cDNAs are reamplified using a polymerase amplification reaction and the first and 
second oligodeoxynucleotide primers. 






EP 0 592 626 B1 

[0053] In preferred embodiments of each of the general aspects of the invention, a set of first and second
oligodeoxynucleotide primers can be used, consisting of more than one of each primer. In some embodiments more than one
of the first primer will be included in the reverse transcription reaction and more than one each of the first and second
primers will be included in the amplification reactions. The use of more than one of each primer will increase the number
of mRNAs identified in each reaction, and the total number of primers to be used will be determined based upon the
desired method of separating the cDNAs such that it remains possible to fully isolate each individual cDNA. In preferred
embodiments a few hundred cDNAs can be isolated and identified using denaturing polyacrylamide gel electrophoresis.
[0054] The method according to the invention is a significant advance over current cloning techniques that utilize subtractive hybridization. In one aspect, the method according to the invention enables genes which are altered in their frequency of expression, as well as mRNAs which are constitutively and differentially expressed, to be identified by simple visual inspection and isolated. In another aspect the method according to the invention provides specific oligodeoxynucleotide primers for amplification of the desired mRNA as cDNA and makes unnecessary an intermediary step of adding a homopolymeric tail to the first cDNA strand for priming of the second cDNA strand, thereby avoiding any interference from the homopolymeric tail with subsequent analysis of the isolated gene and its product. In another aspect the method according to the invention allows the cloning and sequencing of selected mRNAs, so that the investigator may determine the relative desirability of the gene prior to screening a comprehensive cDNA library for the full length gene product.

Description of the Preferred Embodiments

Drawings 

[0055] Fig. 1 is a schematic representation of the method according to the invention. 

[0056] Fig. 2 is the sequence of the 3' end of the N1 gene from normal mouse fibroblast cells (A31) [Seq. ID. No. 9].
[0057] Fig. 3 is the Northern blot of the N1 sequence on total cellular RNA from normal and tumorigenic mouse
fibroblast cells. 

[0058] Fig. 4 is a sequencing gel showing the results of amplification for mRNA prepared from four sources (lanes 1-4), using the Kozak primer alone, the AP-1 primer alone, the Kozak and AP-1 primers, the Kozak and AP-2 primers, the Kozak and AP-3 primers, the Kozak and AP-4 primers and the Kozak and AP-5 primers. This gel will be more fully described later.

[0059] Fig. 5 is a partial sequence of the 5' end of a clone, K1, that was cloned from the A1-5 cell line that was cultured at the non-permissive temperature and then shifted to the permissive temperature (32.5°C) for 24 h prior to the preparation of the mRNA. The A1-5 cell line is from a primary rat embryo fibroblast cell line that has been doubly transformed with ras and a temperature sensitive mutation of p53 ("p53ts").

General Description, Development of the Method 

[0060] By way of illustration a description of examples of the method of the invention follows, with a description by 
way of guidance of how the particular illustrative examples were developed. 

[0061] It is important for operation of the method that the length of the oligodeoxynucleotide be appropriate for specific hybridization to mRNA. In order to obtain specific hybridization, whether for conventional cloning methods or PCR, oligodeoxynucleotides are usually chosen to be 20 or more nucleotides in length. The use of long oligodeoxynucleotides in this instance would decrease the number of mRNAs identified during each trial and would greatly increase the number of oligodeoxynucleotides required to identify every mRNA. Recently, it was demonstrated that 9-10 nucleotide primers can be used for DNA polymorphism analysis by PCR (Williams et al., 1991, Nuc. Acids Res., Vol. 18, pp. 6531-6535).

[0062] The plasmid containing the cloned murine thymidine kinase gene ("TK cDNA plasmid") was used as a model template to determine the required lengths of oligodeoxynucleotides for specific hybridization to a mRNA, and for the production of specific PCR products. The oligodeoxynucleotide primer chosen to hybridize internally in the mRNA was varied between 6 and 13 nucleotides in length, and the oligodeoxynucleotide primer chosen to hybridize at the upstream end of the polyA tail was varied between 7 and 14 nucleotides in length. After numerous trials with different sets and lengths of primers, it was determined that an annealing temperature of 42°C is optimal for product specificity, that the internally hybridizing oligodeoxynucleotide should be at least 9 nucleotides in length, and that an oligodeoxynucleotide at least 13 nucleotides in length is required to bind to the upstream end of the polyA tail.
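
The trade-off between primer length and the number of mRNAs sampled can be illustrated with a rough calculation. The sketch below assumes, purely for illustration, that bases occur independently and with equal frequency, so that a given primer of length k is expected to find about L/4^k perfectly matching sites in a sequence of length L. The mRNA length of 1,500 nucleotides is a hypothetical figure; the count of 15,000 mRNA species is the approximate number given in the background section. Actual priming tolerates some mismatches, so real numbers of amplified species would be expected to be higher.

```python
def expected_sites(primer_len: int, target_len: int) -> float:
    """Expected number of perfect-match sites for one primer of length
    primer_len in a random sequence of length target_len (uniform bases)."""
    positions = max(target_len - primer_len + 1, 0)
    return positions / 4 ** primer_len


MRNA_LEN = 1500   # hypothetical average mRNA length, for illustration only
N_MRNAS = 15000   # approximate number of mRNA species per mammalian cell

for k in (6, 9, 10, 13, 20):
    per_mrna = expected_sites(k, MRNA_LEN)
    print(f"{k:>2}-mer: {per_mrna:.2e} sites per mRNA, "
          f"about {per_mrna * N_MRNAS:.3g} matching mRNAs")
```

Under these assumptions a 20-mer would match essentially no mRNA by chance, whereas a 9- or 10-mer would match a manageable subset, which is consistent with the reasoning given above for using short primers.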

[0063] With reference now to Fig. 1, the method according to the invention is depicted schematically. The mRNAs are mixed with the first primer, for example TTTTTTTTTTTVN [Seq. ID. No. 2] (T11VN) 1, and reverse transcribed 2 to make the first cDNA strand. The cDNA is amplified as follows. The first cDNA strand is added to the second primer and the first primer and the polymerase in the standard buffer with the appropriate concentrations of nucleotides, and the components are heated to 94°C to denature the mRNA:cDNA hybrid 3, the temperature is reduced to 42°C to allow the second primer to anneal 4, and then the temperature is increased to 72°C to allow the polymerase to extend the second primer 5. The cycling of the temperature is then repeated 6, 7, 8, to begin the amplification of the sequences which are hybridized by the first and second primers. The temperature is cycled until the desired number of copies of each sequence has been made.

[0064] As is well known in the art, this amplification method can be accomplished using thermal stable polymerase 
or a polymerase that is not thermal stable. When a polymerase that is not thermal stable is used, fresh polymerase 
must be added after the annealing of the primers to the templates at the start of the elongation or extending step, and 
the extension step must be carried out at a temperature that is permissible for the chosen polymerase. 

[0065] The following examples of the method of the invention are presented for illustrative purposes only. As will be appreciated, the method according to the invention can be used for the isolation of polyA mRNA from any source and can be used to isolate genes expressed either differentially or constitutively at any level, from rare to abundant.

Example 1 


[0066] Experimentation with the conditions required for accurate and reproducible results by PCR was conducted with the TK cDNA plasmid and a single set of oligodeoxynucleotide primers; the sequence TTTTTTTTTTTCA ("T11CA") [Seq. ID. No. 10] was chosen to hybridize to the upstream end of the polyA tail and the sequence CTTGATTGCC ("Ltk3") [Seq. ID. No. 11] was chosen to hybridize 288 base pairs ("bp") upstream of the polyA tail. The expected fragment size using these two primers is 299 bp.
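
The expected size can be reproduced with simple arithmetic if "288 bp upstream of the polyA tail" is read as the distance from the 5' end of Ltk3 to the first A of the tail. That reading is an interpretation and is not stated explicitly in the text. On that assumption the product spans those 288 bp plus the 11 T residues of the anchored primer that pair with the tail. The function name below is hypothetical.

```python
ANCHORED_PRIMER = "TTTTTTTTTTTCA"  # T11CA, Seq. ID. No. 10
LTK3 = "CTTGATTGCC"                # Ltk3, Seq. ID. No. 11 (shown for reference)


def expected_product_length(distance_to_polya: int, anchored_primer: str) -> int:
    """Distance from the 5' end of the upstream primer to the start of the
    polyA tail, plus the oligo(dT) run of the anchored primer that extends
    into the tail (assumed reading of the 288 bp figure)."""
    t_run = len(anchored_primer) - len(anchored_primer.lstrip("T"))
    return distance_to_polya + t_run


print(expected_product_length(288, ANCHORED_PRIMER))  # 299
```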

[0067] PCR was conducted under standard buffer conditions well known in the art with 10 ng TK cDNA plasmid (buffer and polymerase are available from Perkin Elmer-Cetus). The standard conditions were altered in that the primers were used at concentrations of 2.5 µM T11CA [Seq. ID. No. 10] and 0.5 µM Ltk3 [Seq. ID. No. 11], instead of 1 µM of each primer. The concentration of the nucleotides ("dNTPs") was also varied over a 100 fold range, from the standard 200 µM to 2 µM. The PCR parameters were 40 cycles of a denaturing step for 30 seconds at 94°C, an annealing step for 1 minute at 42°C, and an extension step for 30 seconds at 72°C. Significant amounts of non-specific PCR products were observed when the dNTP concentration was 200 µM; concentrations of dNTPs at or below 20 µM yielded specifically amplified PCR products. The specificity of the PCR products was verified by restriction endonuclease digest of the amplified DNA, which yielded the expected sizes of restriction fragments. In some instances it was found that the use of up to 5 fold more of the first primer than the second primer also functioned to increase the specificity of the product. Lowering the dNTP concentration to 2 µM allowed the labelling of the PCR products to a high specific activity with [α-35S] dATP (0.5 µM [α-35S] dATP, Sp. Act. 1200 Ci/mmol), which is necessary for distinguishing the PCR products when resolved by high resolution denaturing polyacrylamide gel electrophoresis, in this case a DNA sequencing gel.
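
The cycling parameters of this example can be written down as a small data structure, which makes the programme easy to check or adapt. The following is only a sketch: the class and function names are hypothetical, and the computed time covers the hold steps alone, ignoring the temperature ramps of any particular instrument.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    name: str
    temp_c: float
    seconds: int


# One cycle as described in Example 1
CYCLE = (
    Step("denature", 94.0, 30),
    Step("anneal", 42.0, 60),
    Step("extend", 72.0, 30),
)
N_CYCLES = 40


def total_hold_minutes(cycle, n_cycles: int) -> float:
    """Total time spent at the programmed temperatures, excluding ramps."""
    return n_cycles * sum(step.seconds for step in cycle) / 60


print(total_hold_minutes(CYCLE, N_CYCLES))  # 80.0
```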

Example 2

[0068] The PCR method of amplification with short oligodeoxynucleotide primers was then used to detect a subset of mRNAs in mammalian cells. Total RNAs and mRNAs were prepared from mouse fibroblast cells which were either growing normally, "cycling", or serum starved, "quiescent". The RNAs and mRNAs were reverse transcribed with T11CA [Seq. ID. No. 10] as the primer. The T11CA primer [Seq. ID. No. 10] was annealed to the mRNA by heating the mRNA and primer together to 65°C and allowing the mixture to gradually cool to 35°C. The reverse transcription reaction was carried out with Moloney murine leukemia virus reverse transcriptase at 35°C. The resultant cDNAs were amplified by PCR in the presence of T11CA [Seq. ID. No. 10] and Ltk3 [Seq. ID. No. 11], as described in Example 1, using 2 µM dNTPs. The use of the T11CA [Seq. ID. No. 10] and Ltk3 [Seq. ID. No. 11] primers allowed the TK mRNA to be used as an internal control for differential expression of a rare mRNA transcript; TK mRNA is present at approximately 30 copies per cell. The DNA sequencing gel revealed 50 to 100 amplified mRNAs in the size range which is optimal for further analysis, between 100 and 500 nucleotides. The patterns of the mRNA species observed in cycling and quiescent cells were very similar, as expected, though some differences were apparent. Notably, the TK gene mRNA, which is expressed during G1 and S phase, was found only in the RNA preparations from cycling cells, as expected, thus demonstrating the ability of this method to separate and isolate rare mRNA species such as TK.

Example 3 

[0069] The expression of mRNAs in normal and tumorigenic mouse fibroblast cells was also compared using the T11CA [Seq. ID. No. 10] and Ltk3 [Seq. ID. No. 11] primers for the PCR amplification. The mRNA was reverse transcribed using T11CA [Seq. ID. No. 10] as the primer and the resultant cDNA was amplified by PCR using 2 µM dNTPs and the PCR parameters described above. The PCR products were separated on a DNA sequencing gel. The TK mRNA was present at the same level in both the normal and tumorigenic mRNA preparations, as expected, and provided a good internal control to demonstrate the representation of rare mRNA species. Several other bands were present in one preparation and not in the other, with a few bands present only in the mRNA from normal cells and a few bands present only in the mRNA from the tumorigenic cells; and some bands were expressed to different levels in the normal and tumorigenic cells. Thus, the method according to the invention can be used to identify genes which are normally continuously expressed (constitutive), and differentially expressed, suppressed, or otherwise altered in their level of expression.

Cloning of the mRNA identified in Example 3 

[0070] Three cDNAs, namely the TK cDNA, one cDNA expressed only in normal cells ("N1"), and one cDNA expressed only in tumorigenic cells ("T1"), were recovered from the DNA sequencing gel by electroelution, ethanol precipitated to remove the urea and other contaminants, and reamplified by PCR, in two consecutive PCR amplifications of 40 cycles each, with the primers T11CA [Seq. ID. No. 10] and Ltk3 [Seq. ID. No. 11] in the presence of 20 µM dNTPs to achieve optimal yield without compromising the specificity. The reamplified PCR products were confirmed to have the appropriate sizes and primer dependencies. As an additional control, the reamplified TK cDNA was digested with two separate restriction endonucleases and the digestion products were also confirmed to be of the correct size.
[0071] The reamplified N1 [Seq. ID. No. 9] was cloned with the TA cloning system, Invitrogen Inc., into the plasmid pCR1000 and sequenced. With reference now to Fig. 2, the nucleotide sequence clearly shows the N1 fragment [Seq. ID. No. 9] to be flanked by the underlined Ltk3 primer 15 at the 5' end and the underlined T11CA primer 16 at the 3' end as expected.
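
The flanking relationship can be checked mechanically: the upstream primer should appear as such at the 5' end of the cloned strand, while the anchored primer should appear as its reverse complement at the 3' end. The sketch below uses only the first 10 and last 13 nucleotides of Seq. ID. No. 9 as they appear in the sequence listing; the helper names are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an unambiguous DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


LTK3 = "CTTGATTGCC"      # Seq. ID. No. 11
T11CA = "TTTTTTTTTTTCA"  # Seq. ID. No. 10

# Ends of the N1 fragment as given in the listing for Seq. ID. No. 9
N1_5PRIME_END = "CTTGATTGCC"
N1_3PRIME_END = "TGAAAAAAAAAAA"


def flanked_by(five_prime_end: str, three_prime_end: str,
               upstream_primer: str, downstream_primer: str) -> bool:
    """True if the fragment starts with the upstream primer and ends with
    the reverse complement of the downstream primer."""
    return (five_prime_end.startswith(upstream_primer)
            and three_prime_end.endswith(reverse_complement(downstream_primer)))


print(reverse_complement(T11CA))                               # TGAAAAAAAAAAA
print(flanked_by(N1_5PRIME_END, N1_3PRIME_END, LTK3, T11CA))   # True
```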

[0072] A Northern analysis of total cellular RNA using a radiolabelled N1 probe reconfirmed that the N1 mRNA was only present in the normal mouse fibroblast cells, and not in the tumorigenic mouse fibroblast cells. With reference now to Fig. 3, the probe used to detect the mRNA is labelled to the right of the figure, and the size of the N1 mRNA can be estimated from the 28S and 18S markers depicted to the left of the figure. The N1 mRNA is present at low abundance in both exponentially growing and quiescent normal cells, lanes 1 and 3, and is absent from both exponentially growing and quiescent tumorigenic cells, lanes 2 and 4. As a control, the same Northern blot was reprobed with a radiolabelled probe for 36B4, a gene that is expressed in both normal and tumorigenic cells, to demonstrate that equal amounts of mRNA, lanes 1-4, were present on the Northern blot.

Example 4

[0073] A comparison of the expression of mRNAs in three cell lines, one of which was tested after culturing under two different conditions, was conducted. The cell lines were a primary rat embryo fibroblast cell line ("REF"), the REF cell line that has been doubly transformed with ras and a mutant of p53 ("T101-4"), and the REF cell line that has been doubly transformed with ras and a temperature sensitive mutation of p53 ("A1-5"). The A1-5 cell line was cultured at the non-permissive temperature of 37°C, and also cultured at 37°C then shifted to the permissive temperature of 32.5°C for 24 h prior to the preparation of the mRNA. The method of the invention was conducted using the primers "Kozak" and one of five arbitrary sequence primers, "AP-1, AP-2, AP-3, AP-4, or AP-5", as the second and first primers, respectively.

[0074] The sequence of the "Kozak" primer was chosen based upon the published consensus sequence for the translation start site of mRNAs (Kozak, 1991, Jour. Cell Biology, Vol. 115, pp. 887-903). Degenerate Kozak primers having sequences substantially identical to the translation start site consensus sequence were used simultaneously; these sequences were 5'-GCCRCCATGG [Seq. ID No. 12], in which the R is dA or dG, so that each oligodeoxynucleotide has only one of the given nucleotides at that position and the primer is a mixture of the two sequences.
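
A degenerate primer is shorthand for a mixture of defined sequences, and the members of the mixture can be enumerated directly. The sketch below takes the Kozak primer as listed under Seq. ID No. 12 (GCCRCCATGG) and expands it using the standard IUPAC meanings of the ambiguity codes that occur in this document (R = A or G, Y = C or T, V = A, C or G, N = any base). The function name is hypothetical.

```python
from itertools import product

# Standard IUPAC nucleotide codes, restricted to those used in this document
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "V": "ACG", "N": "ACGT",
}


def expand(primer: str) -> list[str]:
    """List every unambiguous sequence represented by a degenerate primer."""
    return ["".join(bases) for bases in product(*(IUPAC[b] for b in primer))]


print(expand("GCCRCCATGG"))          # ['GCCACCATGG', 'GCCGCCATGG']
print(len(expand("TTTTTTTTTTTVN")))  # 12
```

The same expansion applied to the anchored primer T11VN gives twelve sequences, one for each combination of the last two positions.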

[0075] The sequences of the five arbitrary primers were as follows: AP-1 had the sequence 5'-AGCCAGCGAA [Seq. ID. No. 13]; AP-2 had the sequence 5'-GACCGCTTGT [Seq. ID. No. 14]; AP-3 had the sequence 5'-AGGTGACCGT [Seq. ID. No. 15]; AP-4 had the sequence 5'-GGTACTCCAC [Seq. ID. No. 16]; and AP-5 had the sequence 5'-GTTGCGATCC [Seq. ID. No. 17]. These sequences were chosen arbitrarily. In general each arbitrary sequence primer was chosen to have a GC content of 50-70%.
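
The base composition criterion can be verified for the five primers. The sketch below takes the sequences as they are given in the sequence listing (Seq. ID. Nos. 13-17) and computes the fraction of G and C residues in each; the function name is hypothetical.

```python
# Arbitrary primers as given in the sequence listing (Seq. ID. Nos. 13-17)
ARBITRARY_PRIMERS = {
    "AP-1": "AGCCAGCGAA",
    "AP-2": "GACCGCTTGT",
    "AP-3": "AGGTGACCGT",
    "AP-4": "GGTACTCCAC",
    "AP-5": "GTTGCGATCC",
}


def gc_fraction(seq: str) -> float:
    """Fraction of positions in seq that are G or C."""
    return sum(base in "GC" for base in seq) / len(seq)


for name, seq in ARBITRARY_PRIMERS.items():
    print(f"{name} {seq} GC = {gc_fraction(seq):.0%}")
```

Each of the five sequences contains six G or C residues out of ten, which falls within the stated 50-70% range.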

[0076] The mRNA was reverse transcribed using one of the AP primers as the first primer, and the resultant first cDNA strand was amplified in the presence of both primers, the AP primer and the degenerate Kozak primer, by PCR using 2 µM dNTPs and the PCR parameters described above. The PCR products were separated on a DNA sequencing gel. At least 50-100 amplified cDNA bands were present in each of the cell lines tested, and some bands were expressed to different levels in the different cell lines. As a control a reaction was conducted using each arbitrary primer in the absence of the Kozak primer. No cDNA was generated by the arbitrary primer alone, thus demonstrating that both primers were required to amplify an mRNA into a cDNA.

[0077] With reference now to Fig. 4, the primer sets used for each reaction are shown at the top of the figure, along the line marked Primers. As a control a reaction was conducted using the primers in the absence of mRNA, and using AP-1 with mRNA in the absence of the Kozak primer. No cDNA was generated by the primers in the absence of mRNA or by the arbitrary primer alone, thus demonstrating that mRNA is required for amplification and that both primers were required to amplify an mRNA into a cDNA. The cDNA products of the amplification were loaded in the same order across the gel; thus the REF cell line is shown in each of lanes 1, cell line T101-4 is shown in each of lanes 2, cell line A1-5 cultured at 37°C is shown in each of lanes 3, and cell line A1-5 cultured at 32.5°C is shown in each of lanes 4. Each pair of primers resulted in the amplification of a different set of mRNAs from the cell lines. The reactions which were conducted using the Kozak primer and any of primers AP-1, AP-2, AP-4, or AP-5 as a primer set resulted in the amplification of the same cDNA pattern from each of cell lines REF, T101-4, A1-5 cultured at 37°C and A1-5 cultured at 32.5°C. The amplification of mRNA from each cell line and temperature using the Kozak degenerate primer and the AP-3 primer resulted in the finding of one band in particular which was present in the mRNA prepared from the A1-5 cell line when cultured at 32.5°C for 24 h, and not in any of the other mRNA preparations, as can be seen in Fig. 4 designated as K1. Thus the method according to the invention may be used to identify genes which are differentially expressed in mutant cell lines.

Cloning of the mRNA identified in Example 4

[0078] The cDNA ("K1") that was expressed only in the A1-5 cell line when cultured at 32.5°C was recovered from the DNA sequencing gel and reamplified using the primers Kozak and AP-3 as described above. The reamplified K1 cDNA was confirmed to have the appropriate size of approximately 450 bp, and was cloned with the TA cloning system, Invitrogen Inc., into the vector pCRII (Invitrogen, Inc.) according to the manufacturer's instructions, and sequenced. With reference now to Fig. 5, the nucleotide sequence clearly shows the K1 clone to be flanked by the underlined Kozak primer 20 at the 5' end and the underlined AP-3 primer 21 at the 3' end as expected. The 5' end of this partial cDNA is identified in Seq. ID No. 18, and the 3' end of this cDNA is identified in Seq. ID No. 19. This partial sequence is an open reading frame, and a search of the gene databases EMBO and Genbank has revealed the translated amino acid sequence from the 3' portion of K1 to be homologous to the ubiquitin conjugating enzyme family (UBC enzyme). The translated amino acid sequence of the 3' portion of K1 is 100% identical to a UBC enzyme from D. melanogaster and 75% identical to the UBC-4 enzyme and 79% identical to the UBC-5 enzyme from the yeast S. saccharomyces; and 75% identical to the UBC enzyme from Arabidopsis thaliana. The K1 clone may contain the actual 5' end of this gene; otherwise the Kozak primer hybridized just after the 5' end. This result demonstrates that the method according to the invention can be used to clone the 5' coding sequence of a gene.

Use 

[0079] The method according to the invention can be used to identify, isolate and clone mRNAs from any number of sources. The method provides for the identification of desirable mRNAs by simple visual inspection after separation, and can be used for investigative research, industrial and medical applications.

[0080] For instance, the reamplified cDNAs can be sequenced, or used to screen a DNA library in order to obtain the full length gene. Once the sequence of the cDNA is known, amino acid peptides can be made from the translated protein sequence and used to raise antibodies. These antibodies can be used for further research of the gene product and its function, or can be applied to medical diagnosis and prognosis. The reamplified cDNAs can be cloned into an appropriate vector for further propagation, or cloned into an appropriate expression vector in order to be expressed, either in vitro or in vivo. The cDNAs which have been cloned into expression vectors can be used in industrial situations for overproduction of the protein product. In other applications the reamplified cDNAs or their respective clones will be used as probes for in situ hybridization. Such probes can also be used for the diagnosis or prognosis of disease.


Other Embodiments 

[0081] Other embodiments are within the following claims. 

[0082] The length of the oligodeoxynucleotide can be varied dependent upon the annealing temperature chosen. In the preferred embodiments the temperature was chosen to be 42°C and the oligonucleotide primers were chosen to be at least 9 nucleotides in length. If the annealing temperature were decreased to 35°C then the oligonucleotide lengths can be decreased to at least 6 nucleotides in length.

[0083] The cDNA could be radiolabelled with radioactive nucleotides other than 35S, such as 32P and 33P. When desired, non-radioactive imaging methods can also be applied to the method according to the invention.
[0084] The amplification of the cDNA could be accomplished by a temperature cycling polymerase chain reaction, as was described, using a heat stable DNA polymerase for the repetitive copying of the cDNA while cycling the temperature for continuous rounds of denaturation, annealing and extension. Alternatively, the amplification could be accomplished by an isothermal DNA amplification method (Walker et al., 1992, Proc. Natl. Acad. Sci., Vol. 89, pp. 392-396). The isothermal amplification method would be adapted for amplifying cDNA by including an appropriate restriction endonuclease sequence, one that will be nicked at hemiphosphorothioate recognition sites and whose recognition site can be regenerated during synthesis with α-35S labelled dNTPs.

[0085] Proteins having similar function or similar functional domains are often referred to as being part of a gene family. Many such proteins have been cloned and identified to contain consensus sequences which are highly conserved amongst the members of the family. This conservation of sequence can be used to design oligodeoxynucleotide primers for the cloning of new members, or related members, of a family. Using the method of the invention the mRNA from a cell can be reverse transcribed, and a cDNA could be amplified using at least one primer that has a sequence substantially identical to the sequence of a mRNA of known sequence. Consensus sequences for at least the following families and functional domains have been described in the literature: protein tyrosine kinases (Hanks et al., 1991, Methods in Enzymology, Vol. 200, pp. 38-81; Wilks, 1991, Methods in Enzymology, Vol. 200, pp. 533-546); homeobox genes; zinc-finger DNA binding proteins (Miller et al., 1985, EMBO Jour., Vol. 4, pp. 1609-1614); receptor proteins; the signal peptide sequence of secreted proteins; proteins that localize to the nucleus (Guiochon-Mantel et al., 1989, Vol. 57, pp. 1147-1154); serine proteases; inhibitors of serine proteases; cytokines; the SH2 and SH3 domains that have been described in tyrosine kinases and other proteins (Pawson et al., 1992, Cell, Vol. 71, pp. 359-362); serine/threonine and tyrosine phosphatases (Cohen, 1991, Methods in Enzymology, Vol. 201, pp. 398-408); cyclins and cyclin-dependent protein kinases (CDKs) (see for ex., Keyomarsi et al., 1993, Proc. Natl. Acad. Sci., USA, Vol. 90, pp. 1112-1116).
[0086] Primers for any consensus sequence can readily be designed based upon the codon usage of the amino acids. The incorporation of degeneracy at one or more sites allows the designing of a primer which will hybridize to a high percentage, greater than 50%, of the mRNAs containing the desired consensus sequence.

[0087] Primers for use in the method according to the invention could be designed based upon the consensus sequence of the zinc finger DNA binding proteins, for example, based upon the amino acid consensus sequence of the proteins PYVC. Useful primers for the cloning of further members of this family can have the following sequences: 5'-GTAYGCNTGT [Seq. ID. No. 20] or 5'-GTAYGCNTGC [Seq. ID. No. 21], in which the Y refers to the deoxynucleotides dT or dC for which the primer is degenerate at this position, and the N refers to inosine (I). The base inosine can pair with all of the other bases, and was chosen for this position of the oligodeoxynucleotide as the codon for valine "V" is highly degenerate in this position. The described oligodeoxynucleotide primers as used will be a mixture of 5'-GTATGCITGT and 5'-GTACGCITGT or a mixture of 5'-GTATGCITGC and 5'-GTACGCITGC.

SEQUENCE LISTING

[0088] 

(1) GENERAL INFORMATION:
    (i) APPLICANT: Liang, Peng
                   Pardee, Arthur B.
    (ii) TITLE OF INVENTION: Identifying, Isolating and Cloning Messenger RNAs
    (iii) NUMBER OF SEQUENCES: 21
    (iv) CORRESPONDENCE ADDRESS:
        (A) ADDRESSEE: Choate, Hall & Stewart
        (B) STREET: Exchange Place, 53 State Street
        (C) CITY: Boston
        (D) STATE: Massachusetts
        (E) COUNTRY: U.S.A.
        (F) ZIP: 02190
    (v) COMPUTER READABLE FORM:
        (A) MEDIUM TYPE: Floppy disk
        (B) COMPUTER: IBM PC compatible
        (C) OPERATING SYSTEM: PC-DOS/MS-DOS
        (D) SOFTWARE: PatentIn Release #1.0, Version #1.25
    (vi) CURRENT APPLICATION DATA:
        (A) APPLICATION NUMBER: US
        (B) FILING DATE: 11-MAR-1993
        (C) CLASSIFICATION:
    (vii) PRIOR APPLICATION DATA:
        (A) APPLICATION NUMBER: US 07/850,343
        (B) FILING DATE: 11-MAR-1992
    (viii) ATTORNEY/AGENT INFORMATION:
        (A) NAME: Pasternack, Sam
        (B) REGISTRATION NUMBER: 29,576
        (C) REFERENCE/DOCKET NUMBER: DFCI234CIP
    (ix) TELECOMMUNICATION INFORMATION:
        (A) TELEPHONE: 617 227-5020
        (B) TELEFAX: 617 227-7566

(2) INFORMATION FOR SEQ ID NO:1:
    (i) SEQUENCE CHARACTERISTICS:
        (A) LENGTH: 13 base pairs
        (B) TYPE: nucleic acid
        (C) STRANDEDNESS: single
        (D) TOPOLOGY: linear
    (ii) MOLECULE TYPE: other nucleic acid
    (iii) HYPOTHETICAL: NO
    (iv) ANTI-SENSE: NO
    (xi) SEQUENCE DESCRIPTION: SEQ ID NO:1:



ri ' mTrrrr tvn 



(2) INFORMATION FOR SEQ ID NO:2:
    (i) SEQUENCE CHARACTERISTICS:
        (A) LENGTH: 13 base pairs
        (B) TYPE: nucleic acid
        (C) STRANDEDNESS: single
        (D) TOPOLOGY: linear
    (ii) MOLECULE TYPE: other nucleic acid
    (iii) HYPOTHETICAL: NO
    (iv) ANTI-SENSE: NO
    (xi) SEQUENCE DESCRIPTION: SEQ ID NO:2:



rx ' rrriTXTX ' ttv 



(2) INFORMATION FOR SEQ ID NO:3:
    (i) SEQUENCE CHARACTERISTICS:
        (A) LENGTH: 13 base pairs
        (B) TYPE: nucleic acid
        (C) STRANDEDNESS: single
        (D) TOPOLOGY: linear
    (ii) MOLECULE TYPE: other nucleic acid
    (iii) HYPOTHETICAL: NO
    (iv) ANTI-SENSE: NO
    (xi) SEQUENCE DESCRIPTION: SEQ ID NO:3:
rx ' rx ' rriTiT vnn 

(2) INFORMATION FOR SEQ ID NO:4:
    (i) SEQUENCE CHARACTERISTICS:
        (A) LENGTH: 10 base pairs
        (B) TYPE: nucleic acid
        (C) STRANDEDNESS: single
        (D) TOPOLOGY: linear
    (ii) MOLECULE TYPE: other nucleic acid
    (iii) HYPOTHETICAL: NO
    (iv) ANTI-SENSE: NO
    (xi) SEQUENCE DESCRIPTION: SEQ ID NO:4:

IIKi'i'l!ATrMN 

(2) INFORMATION FOR SEQ ID NO:5:
    (i) SEQUENCE CHARACTERISTICS:
        (A) LENGTH: 10 base pairs
        (B) TYPE: nucleic acid
        (C) STRANDEDNESS: single
        (D) TOPOLOGY: linear
    (ii) MOLECULE TYPE: other nucleic acid
    (iii) HYPOTHETICAL: NO
    (iv) ANTI-SENSE: NO
    (xi) SEQUENCE DESCRIPTION: SEQ ID NO:5:

(2) INFORMATION FOR SEQ ID NO:6:
    (i) SEQUENCE CHARACTERISTICS:
        (A) LENGTH: 10 base pairs
        (B) TYPE: nucleic acid
        (C) STRANDEDNESS: single
        (D) TOPOLOGY: linear
    (ii) MOLECULE TYPE: other nucleic acid
    (iii) HYPOTHETICAL: NO
    (iv) ANTI-SENSE: NO
    (xi) SEQUENCE DESCRIPTION: SEQ ID NO:6:

(2) INFORMATION FOR SEQ ID NO:7:
    (i) SEQUENCE CHARACTERISTICS:
        (A) LENGTH: 10 base pairs
        (B) TYPE: nucleic acid
        (C) STRANDEDNESS: single
        (D) TOPOLOGY: linear
    (ii) MOLECULE TYPE: other nucleic acid
    (iii) HYPOTHETICAL: NO
    (iv) ANTI-SENSE: NO
    (xi) SEQUENCE DESCRIPTION: SEQ ID NO:7:

(2) INFORMATION FOR SEQ ID NO:8:
    (i) SEQUENCE CHARACTERISTICS:
        (A) LENGTH: 10 base pairs
        (B) TYPE: nucleic acid
        (C) STRANDEDNESS: single
        (D) TOPOLOGY: linear
    (ii) MOLECULE TYPE: other nucleic acid
    (iii) HYPOTHETICAL: NO
    (iv) ANTI-SENSE: NO
    (xi) SEQUENCE DESCRIPTION: SEQ ID NO:8:

GCCACCATGG 10

(2) INFORMATION FOR SEQ ID NO:9:
    (i) SEQUENCE CHARACTERISTICS:
        (A) LENGTH: 260 base pairs
        (B) TYPE: nucleic acid
        (C) STRANDEDNESS: single
        (D) TOPOLOGY: linear
    (ii) MOLECULE TYPE: cDNA
    (iii) HYPOTHETICAL: NO
    (iv) ANTI-SENSE: NO
    (xi) SEQUENCE DESCRIPTION: SEQ ID NO:9:

CTTGATTGCC TCCTACAGCA GTTGCAGGCA CCTmCCTG TACCATGAAG TTCACAOTCC 60
GGGATTOT G A CCCTAAXACT GGAGITCCAG ATGAAGHTGG ATATGATGAT GAATATCTGC 120
TGGAAGATCT TGAGGTAACT GTGTCIGATC ATATTCAGAA GATACTAAAA CCTAACTTCG 180
CT G CTGCCTG GGAAGAGGTG GGAGGAGCAG CTGCGACAGA GCGTCCTCTT CACAGAGGGG 240
TCCTGGGTGA AAAAAAAAAA 260

10 (2) INFORMATION FOR SEQ ID NO:10: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 13 base pairs 
15 (B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(il) MOLECULE TYPE: other nucleic acid 
20 (iii) HYPOTHETICAL: NO 

(iv) ANTI-SENSE: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:10: 



25 



30 



35 



40 



45 



TCA 13 



(2) INFORMATION FOR SEQ ID N0:11: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 10 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(il) MOLECULE TYPE: other nucleic acid 

(iii) HYPOTHETICAL: NO 

(iv) ANTI-SENSE: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID N0:11: 

GTTQATXOCC 10 

(2) INFORMATION FOR SEQ ID N0:12: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 10 base pairs 

(B) TYPE: nucleic acid 

50 (C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other nucleic acid 

(iii) HYPOTHETICAL: NO 
55 (iv) ANTI-SENSE: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID N0:12: 
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GCCRCCATGG 

(2) INFORMATION FOR SEQ ID N0:13: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 10 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other nucleic acid 
(iil) HYPOTHETICAL: NO 

(iv) ANTI-SENSE: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID N0:13: 
A6CCAGCQAA 

(2) INFORMATION FOR SEQ ID N0:14: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 10 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other nucleic acid 

(iii) HYPOTHETICAL: NO 

(iv) ANTI-SENSE: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID N0:14: 
GACCGCTTGT 

(2) INFORMATION FOR SEQ ID N0:15: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 10 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other nucleic acid 

(iii) HYPOTHETICAL: NO 

(iv) ANTI-SENSE: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID N0:15: 
AGGTQACCGT 

(2) INFORMATION FOR SEQ ID N0:16: 
(i) SEQUENCE CHARACTERISTICS: 
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(A) LENGTH: 10 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other nucleic acid 

(iii) HYPOTHETICAL: NO 

(iv) ANTI-SENSE: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID N0:16: 
GGTACTCCAC 

(2) INFORMATION FOR SEQ ID N0:17: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 10 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other nucleic acid 

(iii) HYPOTHETICAL: NO 

(iv) ANTI-SENSE: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:17:

GTTGCGATCC 

(2) INFORMATION FOR SEQ ID NO:18:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 42 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(iii) HYPOTHETICAL: NO 

(iv) ANTI-SENSE: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:18:

GCCGCCATGG CTCTGAAGAG AATCCACAAG GACACCCATG AA

(2) INFORMATION FOR SEQ ID NO:19:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 78 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(iii) HYPOTHETICAL: NO 

(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:19:



GTTGCATTTA CAACAAGAAT TTATCATCCA AATATTAACA GTAATGGCAG CATTTCTCTT 60
GATATTCTAC GGTCACCT 78

(2) INFORMATION FOR SEQ ID NO:20: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 10 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other nucleic acid 

(iii) HYPOTHETICAL: NO 

(iv) ANTI-SENSE: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:20: 

GTAYGCffTaT 

(2) INFORMATION FOR SEQ ID NO:21:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 10 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other nucleic acid 

(iii) HYPOTHETICAL: NO 

(iv) ANTI-SENSE: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:21:

GTAYGCtaTGC XO 



Claims 

1. A method for isolating a DNA complementary to a mRNA in a nucleic acid sample comprising the steps of: 

a) contacting the sample with a first oligonucleotide primer under conditions in which said first primer hybridizes 
with any mRNA at a first site having a complementary base sequence; 

b) reverse transcribing the mRNA using a reverse transcriptase and said first primer to produce a first DNA 
strand complementary to at least a portion of the mRNA upstream from said first site; 

c) contacting the first DNA strand with a second oligodeoxynucleotide primer under conditions in which said 
second primer hybridizes with complementary DNA at a second site; 

d) extending the second primer using a DNA polymerase to produce a second DNA strand complementary to 
the first DNA strand downstream from said second site; and 

e) amplifying the first and second DNA strands using a polymerase, said first primer and said second primer 
to form the complementary DNA; wherein: 

i) said first primer hybridizes with mRNA at a site that includes a polyA signal sequence; and/or 

ii) said first primer hybridizes at a portion of the polyadenosine (polyA) tail of said mRNA and at least one
non-polyA nucleotide immediately upstream of said portion; and/or

iii) said first primer hybridizes at a site including a sequence immediately upstream of a first A
ribonucleotide of the mRNA's polyA tail; and/or

iv) said second primer with a base sequence of at least 6 nucleotides and containing an arbitrary sequence
and a base sequence containing a Kozak sequence.

2. The method of claim 1, wherein said first primer:

a) hybridizes with mRNA that includes at least two nucleotides upstream from and adjacent to the first A
ribonucleotide of the polyA tail;

b) includes at least 13 nucleotides;

c) includes a polyA-complementary region comprising at least 11 nucleotides and, upstream from said
polyA-complementary region, a non-polyA-complementary region comprising at least one nucleotide, optionally
the non-polyA-complementary region comprising at least 2 contiguous nucleotides.

3. The method of claim 2c), wherein said non-polyA-complementary region comprises 3'-NV, wherein V is one of 
deoxyadenosine, deoxycytidine, or deoxyguanosine, and N is one of deoxyadenosine, deoxycytidine, deoxygua- 
nosine, or deoxythymidine. 
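
As an illustration of the primer family recited in claim 3, the following sketch enumerates the possible 3'-NV ends. It assumes a polyA-complementary region of exactly eleven thymidines (claim 2c requires only "at least 11") and writes each primer 5' to 3', so that V precedes N in the string; the function name `anchored_primers` is hypothetical and is not taken from the specification.

```python
from itertools import product

V_BASES = "ACG"   # V: deoxyadenosine, deoxycytidine or deoxyguanosine
N_BASES = "ACGT"  # N: any of the four deoxynucleotides


def anchored_primers(poly_t_length: int = 11) -> list[str]:
    """Return every primer 5'-T(n)VN-3', i.e. ending 3'-NV when read 3' to 5'.

    The default of eleven thymidines is an assumption made for illustration.
    """
    poly_t = "T" * poly_t_length
    return [poly_t + v + n for v, n in product(V_BASES, N_BASES)]


primers = anchored_primers()
print(len(primers))  # three choices for V times four choices for N
for primer in primers:
    print(primer)
```

Excluding thymidine at the V position keeps the primer from sliding along the polyA tail, so each of the resulting primers anchors at the junction between the tail and the preceding sequence.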

4. The method of claim 1a), wherein said first primer comprises at least 6 deoxyribonucleotides.

5. The method of any one of claims 1 to 4, wherein 

a) said second primer comprises at least 6 deoxyribonucleotides; or

b) said second primer includes a randomly selected nucleotide sequence; or 

c) said first or the second primer includes deoxyadenosine, deoxycytidine, deoxyguanosine, and deoxythymi- 
dine; or 

d) said first or second primer includes a restriction endonuclease recognition sequence; or 

e) said second primer includes a sequence identical to a sequence contained within an mRNA of known
sequence; or

f) at least one of said first or second primers comprises a plurality of oligodeoxynucleotides.

6. A method according to claim 1, comprising:

contacting the mRNA with the first primer under conditions in which said first primer hybridizes with mRNA at
a site,

reverse transcribing the mRNA using the reverse transcriptase and said first primer, to produce the first DNA
strand,

contacting the first DNA strand with the second primer under conditions in which said second primer hybridizes
with the first DNA strand at the second site, which includes a Kozak sequence,

extending the second primer using a DNA polymerase to produce a second DNA strand complementary to
the first DNA strand downstream from the site of hybridization of said second primer with said first DNA strand,
and

amplifying the first and second DNA strands using a DNA polymerase and said first and second primers.

7. The method of claim 6, wherein said first primer includes a sequence substantially identical to a sequence contained 
within an mRNA of known sequence. 

8. The method according to any one of the preceding claims, wherein

a) said first primer comprises at least 9 deoxyribonucleotides, optionally at least 10 deoxyribonucleotides; or 

b) said second primer comprises at least 9 deoxyribonucleotides, optionally at least 10 deoxyribonucleotides; 
or 

c) said first primer includes a selected arbitrary sequence of deoxyribonucleotides; or

d) said first primer or said second primer includes a restriction endonuclease recognition sequence; or

e) at least one of said first or second primers comprises a plurality of oligodeoxynucleotides. 

9. A method of comparing the presence or level of individual mRNA molecules in two or more nucleic acid samples,
comprising the steps of:

a) providing a first nucleic acid sample including mRNA molecules and performing the method of any one of
claims 1 to 9 thereupon to produce a first population of amplification products comprising the complementary
DNA;

b) providing a second nucleic acid sample including mRNA molecules and performing the method of any one
of claims 1 to 9 thereupon to produce a second population of amplification products comprising the
complementary DNA;

c) comparing the presence or level of individual amplification products in the first and second populations of
amplification products.

10. The method of claim 9, wherein: 

a) said first nucleic acid sample comprises mRNAs expressed in a first cell and said second nucleic acid
sample consists of mRNAs expressed in a second cell; or

b) said first nucleic acid sample comprises mRNAs expressed in a cell at a first developmental stage and said 
second nucleic acid sample comprises mRNAs expressed in said cell at a second developmental stage. 

11. The method of claim 9 or 10, wherein said first primer includes a polyA-complementary region comprising at least
11 nucleotides and, immediately downstream from said polyA-complementary region, a non-polyA-complementary 
region comprising at least one nucleotide, and optionally said polyA-complementary region comprises at least 11 
contiguous thymidines. 

12. The method of claim 11, wherein said first primer comprises at least 13 nucleotides.

13. The method of claim 9, wherein the nucleotide sequence of said first or said second primer contains a restriction
endonuclease recognition site, and optionally at least one of said first or second primers comprises a plurality of
oligodeoxynucleotides, and further optionally said plurality of oligonucleotides comprises a plurality of
oligodeoxynucleotide molecules having the same nucleotide sequence, or individual oligodeoxynucleotide molecules in
said plurality of oligodeoxynucleotides have different nucleotide sequences.

14. The method according to any one of claims 9 to 13, further comprising the step of detecting a difference in the
presence or level of an individual amplification product in said first population of amplification products as compared
with said second population of amplification products.

15. The method according to any one of claims 9 to 14, wherein the amplifying steps each comprise performing a
polymerase chain reaction in which the concentration of dNTPs is at or below approximately 20 µM, and/or the
amplifying steps each comprise performing a polymerase chain reaction in which the concentration of dNTPs is
approximately 2 µM.

16. The method according to any one of claims 9 to 15, wherein the step of comparing comprises resolving each of 
said first and second populations of amplification products by gel electrophoresis and comparing the presence or 
level of bands of particular sizes. 

17. The method according to any one of claims 9 to 16, wherein said first cell comprises a tumorigenic cell and said 
second cell comprises a normal cell. 

18. The method according to any one of claims 9 to 17, further comprising a step of cloning individual amplification
products from said first or second populations of amplification products.

19. The method according to any one of claims 9 to 18, wherein the second primer that hybridizes to a second site in 
said first and second samples hybridizes to the second site which includes NNNRNNATGN. 
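
The site recited in claim 19 is written in IUPAC notation (N is any base, R is a purine). As a hedged sketch of how such a site could be located in a sense-strand sequence, the code below translates the motif into a regular expression and reports 0-based match positions. The helper names and the sample sequence (read from FIG. 5 after OCR correction) are assumptions used only for illustration.

```python
import re

# IUPAC codes needed for the claimed motif: N = any base, R = purine (A or G)
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "R": "[AG]", "N": "[ACGT]"}


def motif_to_regex(motif: str) -> re.Pattern:
    """Translate an IUPAC motif into a regex; the lookahead allows overlapping sites."""
    body = "".join(IUPAC[base] for base in motif.upper())
    return re.compile(f"(?=({body}))")


def find_sites(sequence: str, motif: str = "NNNRNNATGN") -> list[int]:
    """Return the 0-based start of every occurrence of the motif in the sequence."""
    return [m.start() for m in motif_to_regex(motif).finditer(sequence.upper())]


# Sample input only: sense-strand sequence as read from FIG. 5 (OCR-corrected)
example = "GCCACCATGGCTCTGAAGAGAATCCACAAGGACACCCATGAA"
print(find_sites(example))
```

A second ATG near the end of the sample sequence is not reported because the base three positions before it is a pyrimidine, which is the kind of discrimination the R position of the motif provides.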

20. The method according to any one of claims 9 to 19, wherein said first primer has a GC content within the range
of about 50-70%. 
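
The GC criterion of claim 20 is a simple ratio, sketched below. The function names are hypothetical, the bounds are treated as inclusive although the claim says only "about 50-70%", and the 10-mers of SEQ ID NO:13 to NO:17 (as read from the sequence listing after OCR correction) serve purely as sample input; the claim itself concerns the first primer.

```python
def gc_content(primer: str) -> float:
    """Fraction of G and C bases in a primer sequence."""
    primer = primer.upper()
    if not primer:
        raise ValueError("empty primer sequence")
    return sum(base in "GC" for base in primer) / len(primer)


def within_claimed_range(primer: str, low: float = 0.50, high: float = 0.70) -> bool:
    """True if the GC fraction lies in [low, high]; inclusive bounds are an assumption."""
    return low <= gc_content(primer) <= high


# Sample input only: 10-mers as read from the sequence listing
samples = ["AGCCAGCGAA", "GACCGCTTGT", "AGGTGACCGT", "GGTACTCCAC", "GTTGCGATCC"]
for primer in samples:
    print(primer, f"{gc_content(primer):.0%}", within_claimed_range(primer))
```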

21. The method according to any one of claims 9 to 20, wherein the nucleotide sequence of said first primer includes
a sequence substantially complementary to a consensus sequence found in a gene family.

22. The method according to any one of claims 9 to 21, wherein said first and second sites are separated from one
another so that at least some amplification products in said first and second populations of amplification products
have a size in the range of approximately 100-500 basepairs.

23. The method according to claim 14, further comprising a step of isolating an individual amplification product whose 
presence or level differs in said first and second populations of amplification products. 

24. The method of claim 23, further comprising a step of cloning said isolated amplification product into a vector.

25. The method of claim 23 or 24, further comprising either a step of screening a nucleic acid library with said isolated 
amplification product, or a step of determining the nucleotide sequence of at least a portion of said isolated am- 
plification product. 

Claims (translated from the German text, Patentansprüche)

1. A method for isolating a DNA that is complementary to an mRNA in a nucleic acid sample, comprising the steps of:

a) contacting the sample with a first oligonucleotide primer under conditions in which said first primer
hybridizes with any mRNA at a first site having a complementary base sequence;

b) reverse transcribing the mRNA using a reverse transcriptase and said first primer to produce a first
DNA strand that is complementary to at least a part of the mRNA upstream of said first site;

c) contacting the first DNA strand with a second oligodeoxynucleotide primer under conditions in which
said second primer hybridizes with complementary DNA at a second site;

d) extending the second primer using a DNA polymerase to produce a second DNA strand that is complementary
to the first DNA strand downstream of said second site; and

e) amplifying the first and second DNA strands using a polymerase, said first primer and said second
primer to form the complementary DNA; wherein:

i) said first primer hybridizes with mRNA at a site that includes a polyA signal sequence; and/or

ii) said first primer hybridizes to a part of the polyadenosine (polyA) tail of said mRNA and to at least
one non-polyA nucleotide immediately upstream of said part; and/or

iii) said first primer hybridizes to a site including a sequence immediately upstream of a first
A ribonucleotide of the polyA tail of the mRNA; and/or

iv) said second primer with a base sequence of at least 6 nucleotides and containing an arbitrary
sequence and a base sequence that contains a Kozak sequence.

2. The method of claim 1, wherein said first primer:

a) hybridizes with mRNA that includes at least two nucleotides upstream of and adjacent to the first
A ribonucleotide of the polyA tail;

b) comprises at least 13 nucleotides;

c) includes a polyA-complementary region comprising at least 11 nucleotides and, upstream of said
polyA-complementary region, a non-polyA-complementary region comprising at least one nucleotide, optionally
the non-polyA-complementary region comprising at least 2 contiguous nucleotides.

3. The method of claim 2c), wherein said non-polyA-complementary region comprises 3'-NV, wherein V is a
deoxyadenosine, deoxycytidine or deoxyguanosine and N is a deoxyadenosine, deoxycytidine, deoxyguanosine
or deoxythymidine.

4. The method of claim 1a), wherein said first primer comprises at least 6 deoxyribonucleotides.

5. The method of any one of claims 1 to 4, wherein

a) said second primer comprises at least 6 deoxyribonucleotides; or

b) said second primer includes a randomly selected nucleotide sequence; or

c) said first or the second primer includes deoxyadenosine, deoxycytidine, deoxyguanosine and
deoxythymidine; or

d) said first or second primer includes a restriction endonuclease recognition sequence; or

e) said second primer includes a sequence that is identical to a sequence contained in an mRNA of known
sequence; or

f) at least one of said first or second primers comprises a plurality of oligodeoxynucleotides.

6. A method according to claim 1, comprising:

contacting the mRNA with the first primer under conditions in which said first primer hybridizes with
mRNA at a site,

reverse transcribing the mRNA using the reverse transcriptase and said first primer to produce a first
DNA strand,

contacting the first DNA strand with the second primer under conditions in which said second primer
hybridizes with the first DNA strand at the second site, which includes a Kozak sequence,

extending the second primer using a DNA polymerase to produce a second DNA strand that is complementary
to the first DNA strand downstream of the site at which said second primer hybridizes with said first DNA
strand, and

amplifying the first and second DNA strands using a DNA polymerase and said first and second primers.

7. The method of claim 6, wherein said first primer includes a sequence that is substantially identical to
a sequence contained in an mRNA of known sequence.

8. The method according to any one of the preceding claims, wherein

a) said first primer comprises at least 9 deoxyribonucleotides, optionally at least 10 deoxyribonucleotides; or

b) said second primer comprises at least 9 deoxyribonucleotides, optionally at least 10 deoxyribonucleotides; or

c) said first primer includes a selected arbitrary sequence of deoxyribonucleotides; or

d) said first or said second primer includes a restriction endonuclease recognition sequence; or

e) at least one of said first primer or second primer comprises a plurality of oligodeoxynucleotides.

9. A method of comparing the presence or level of individual mRNA molecules in two or more nucleic acid
samples, comprising the steps of:

a) providing a first nucleic acid sample containing mRNA molecules and performing the method of any one
of claims 1 to 9 to produce a first population of amplification products comprising the complementary DNA;

b) providing a second nucleic acid sample containing mRNA molecules and performing the method of any one
of claims 1 to 9 to produce a second population of amplification products comprising the complementary DNA;

c) comparing the presence or level of individual amplification products in the first and second
populations of amplification products.

10. The method of claim 9, wherein:

a) said first nucleic acid sample comprises mRNAs expressed in a first cell and said second nucleic acid
sample consists of mRNAs expressed in a second cell; or

b) said first nucleic acid sample comprises mRNAs expressed in a cell at a first developmental stage and
said second nucleic acid sample comprises mRNAs expressed in said cell at a second developmental stage.

11. The method of claim 9 or 10, wherein said first primer includes a polyA-complementary region comprising
at least 11 nucleotides and, immediately downstream of said polyA-complementary region, a
non-polyA-complementary region comprising at least one nucleotide, and optionally said polyA-complementary
region comprises at least 11 contiguous thymidines.

12. The method of claim 11, wherein said first primer comprises at least 13 nucleotides.

13. The method of claim 9, wherein the nucleotide sequence of said first or said second primer contains a
restriction endonuclease recognition site, and optionally at least one of said first or said second primers
comprises a plurality of oligodeoxynucleotides, and further optionally said plurality of oligonucleotides
comprises a plurality of oligodeoxynucleotide molecules having the same nucleotide sequence, or individual
oligodeoxynucleotide molecules in said plurality of oligodeoxynucleotides have different nucleotide sequences.

14. The method according to any one of claims 9 to 13, further comprising the step of detecting a difference
in the presence or level of an individual amplification product in said first population of amplification
products as compared with said second population of amplification products.

15. The method according to any one of claims 9 to 14, wherein the amplifying steps each comprise performing
a polymerase chain reaction in which the concentration of dNTPs is at or below approximately 20 µM, and/or
the amplifying steps each comprise performing a polymerase chain reaction in which the concentration of
dNTPs is approximately 2 µM.

16. The method according to any one of claims 9 to 15, wherein the comparing step comprises resolving each
of the first and second populations of amplification products by gel electrophoresis and comparing the
presence or level of bands of particular sizes.

17. The method according to any one of claims 9 to 16, wherein said first cell comprises a tumor-forming
cell and said second cell comprises a healthy cell.

18. The method according to any one of claims 9 to 17, further comprising a step of cloning individual
amplification products from said first or second population of amplification products.

19. The method according to any one of claims 9 to 18, wherein the second primer that hybridizes to a second
site in said first and second samples hybridizes to the second site which includes NNNRNNATGN.

20. The method according to any one of claims 9 to 19, wherein said first primer has a GC content in the
range of about 50-70%.

21. The method according to any one of claims 9 to 20, wherein the nucleotide sequence of said first primer
includes a sequence that is complementary to a consensus sequence found in a gene family.

22. The method according to any one of claims 9 to 21, wherein said first and second sites are separated
from one another so that at least some amplification products in said first and second populations of
amplification products have a size in the range of approximately 100-500 base pairs.

23. The method according to claim 14, further comprising a step of isolating an individual amplification
product whose presence or level differs in said first and second populations of amplification products.

24. The method of claim 23, further comprising a step of cloning said isolated amplification product into
a vector.

25. The method of claim 23 or 24, further comprising either a step of screening a nucleic acid library with
said isolated amplification product or a step of determining the nucleotide sequence of at least a part of
said isolated amplification product.

Claims (translated from the French text, Revendications)

1. A method for isolating a DNA complementary to an mRNA in a nucleic acid sample, comprising the steps of:

a) contacting the sample with a first oligonucleotide primer under conditions in which said first primer
hybridizes with any mRNA at a first site having a complementary base sequence;

b) reverse transcribing the mRNA using a reverse transcriptase and said first primer to produce a first
DNA strand complementary to at least a portion of the mRNA upstream of said first site;

c) contacting the first DNA strand with a second oligodeoxynucleotide primer under conditions in which
the second primer hybridizes with a complementary DNA at a second site;

d) extending the second primer using a DNA polymerase to produce a second DNA strand complementary to
the first DNA strand downstream of said second site; and

e) amplifying the first and second DNA strands using a polymerase, said first primer and said second
primer, to form the complementary DNA; wherein:

i) said first primer hybridizes to the mRNA at a site that contains a polyA signal sequence; and/or

ii) said first primer hybridizes to a portion of the polyadenosine (polyA) tail of said mRNA and at
least one non-polyA nucleotide just downstream of said portion; and/or

iii) said first primer hybridizes to a site containing a sequence immediately downstream of a first
A ribonucleotide of the polyA tail of the mRNAs; and/or

iv) said second primer hybridizes with a base sequence of at least 6 nucleotides and containing an
arbitrary sequence and a base sequence containing a Kozak sequence.

2. The method according to claim 1, wherein said first primer:

a) hybridizes with an mRNA that contains at least two nucleotides upstream of the first A ribonucleotide
of the polyA tail, and adjacent to the latter;

b) contains at least 13 nucleotides;

c) contains a polyA-complementary region comprising at least 11 nucleotides and, upstream of said
polyA-complementary region, a non-polyA-complementary region comprising at least one nucleotide, the
non-polyA-complementary region optionally comprising at least 2 contiguous nucleotides.

3. The method according to claim 2c), wherein said non-polyA-complementary region comprises 3'-NV, where
V is a deoxyadenosine, deoxycytidine or deoxyguanosine residue, and N is a deoxyadenosine, deoxycytidine,
deoxyguanosine or deoxythymidine residue.

4. The method according to claim 1a), wherein said first primer comprises at least 6 deoxyribonucleotides.

5. The method according to any one of claims 1 to 4, wherein

a) said second primer comprises at least 6 deoxyribonucleotides; or

b) said second primer contains a randomly chosen nucleotide sequence; or

c) said first or second primer contains a deoxyadenosine, deoxycytidine, deoxyguanosine or deoxythymidine
residue; or

d) said first or second primer contains a restriction endonuclease recognition sequence; or

e) said second primer contains a sequence identical to a sequence contained in an mRNA of known sequence; or

f) at least one of said first or second primers comprises a plurality of oligodeoxynucleotides.

6. The method according to claim 1, comprising the steps of:

contacting the mRNA with the first primer under conditions in which said first primer hybridizes with
the mRNA at a site;

reverse transcribing the mRNA using the reverse transcriptase and said first primer, to produce the first
DNA strand;

contacting the first DNA strand with the second primer under conditions in which the second primer
hybridizes with the first DNA strand at the second site, which includes a Kozak sequence;

extending the second primer using a DNA polymerase to produce a second DNA strand complementary to the
first DNA strand downstream of the site of hybridization of said second primer with said first DNA strand; and

amplifying the first and second DNA strands using a DNA polymerase and said first and second primers.

7. The method according to claim 6, wherein said first primer contains a sequence substantially identical
to a sequence contained in an mRNA of known sequence.

8. The method according to any one of the preceding claims, wherein

a) said first primer comprises at least 9 deoxyribonucleotides, optionally at least 10 deoxyribonucleotides; or

b) said second primer comprises at least 9 deoxyribonucleotides, optionally at least 10 deoxyribonucleotides; or

c) said first primer contains an arbitrarily chosen sequence of deoxyribonucleotides; or

d) said first primer or said second primer contains a restriction endonuclease recognition sequence; or

e) at least one of said first or second primers comprises a plurality of oligodeoxynucleotides.

9. A method of comparing the presence or level of individual mRNA molecules in at least two nucleic acid
samples, comprising the steps of:

a) obtaining a first nucleic acid sample containing mRNA molecules and subjecting it to the method
according to any one of claims 1 to 9 in order to produce a first population of amplification products
comprising the complementary DNA;

b) obtaining a second nucleic acid sample containing mRNA molecules and subjecting it to the method
according to any one of claims 1 to 9 in order to produce a second population of amplification products
comprising the complementary DNA;

c) comparing the presence or level of the individual amplification products in the first and second
populations of amplification products.

10. The method according to claim 9, wherein:

a) said first nucleic acid sample comprises mRNAs expressed in a first cell and said second nucleic acid
sample consists of mRNAs expressed in a second cell; or

b) said first nucleic acid sample comprises mRNAs expressed in a cell at a first developmental stage and
said second nucleic acid sample comprises mRNAs expressed in said cell at a second developmental stage.

11. The method according to claim 9 or 10, wherein said first primer contains a polyA-complementary region
comprising at least 11 nucleotides and, just downstream of said polyA-complementary region, a
non-polyA-complementary region comprising at least one nucleotide, and optionally said polyA-complementary
region comprises at least 11 contiguous thymidine residues.

12. The method according to claim 11, wherein said first primer comprises at least 13 nucleotides.

13. The method according to claim 9, wherein the nucleotide sequence of said first or second primer contains
a restriction endonuclease recognition site and, optionally, at least one of said first or second primers
comprises a plurality of oligodeoxynucleotides, and further optionally said plurality of oligonucleotides
comprises a plurality of oligodeoxynucleotide molecules having the same nucleotide sequence, or the
individual oligodeoxynucleotide molecules in said plurality of oligodeoxynucleotides have different
nucleotide sequences.

14. The method according to any one of claims 9 to 13, further comprising the step of detecting a difference
in the presence or level of an individual amplification product in said first population of amplification
products compared with said second population of amplification products.

15. The method according to any one of claims 9 to 14, wherein the amplification steps each comprise
carrying out a polymerase chain reaction in which the concentration of dNTPs is less than or equal to
approximately 20 µM, and/or the amplification steps each comprise carrying out a polymerase chain reaction
in which the concentration of dNTPs is approximately 2 µM.

16. The method according to any one of claims 9 to 15, wherein the comparing step comprises resolving each
of said first and second populations of amplification products by gel electrophoresis and comparing the
presence or level of bands of particular sizes.

17. The method according to any one of claims 9 to 16, wherein said first cell comprises an oncogenic cell
and said second cell comprises a normal cell.

18. The method according to any one of claims 9 to 17, further comprising a step of cloning individual
amplification products from said first or second populations of amplification products.

19. The method according to any one of claims 9 to 18, wherein the second primer that hybridizes to a second
site in said first and second samples hybridizes to the second site which contains NNNRNNATGN.

20. The method according to any one of claims 9 to 19, wherein said first primer has a GC content in the
range of about 50-70%.

21. The method according to any one of claims 9 to 20, wherein the nucleotide sequence of said first primer
contains a sequence substantially complementary to a consensus sequence found in a gene family.

22. The method according to any one of claims 9 to 21, wherein said first and second sites are separated
from one another such that at least some amplification products in said first and second populations of
amplification products have a size in the range of about 100-500 base pairs.

23. The method according to claim 14, further comprising a step of isolating an individual amplification
product whose presence or level differs in said first and second populations of amplification products.

24. The method according to claim 23, further comprising a step of cloning said isolated amplification
product into a vector.

25. The method according to claim 23 or 24, further comprising either a step of screening a nucleic acid
library with said isolated amplification product, or a step of determining the nucleotide sequence of at
least a portion of said isolated amplification product.

FIG. 5

GCCACCATGG CTCTGAAGAGAATCCACAAGGACACCCATGAA
Kozak

GTTGCATTTACAACAAGAA
TTTATCATCCAAATATTAACAGTAATGGCAGCATTTGTCTTGATATTCTACGGTCACC 3'
3' TGCCAGTGGA 5'
AP-3